Evaluating the Significance of Global and Local Features in Expressed Sequence Tag: A Clustering Quality Perspective
Abstract
Clustering of expressed sequence tags (ESTs) plays an important role in gene analysis. Alignment-based sequence comparison is commonly used to measure the similarity between sequences, and several alignment-free comparisons have recently been introduced. In this paper, we evaluate how global and local features extracted by two alignment-free approaches, namely the compression-based method and the generalized relative entropy method, affect the quality of EST clustering. Our evaluation shows that the local features of ESTs yield much better clustering results than the global features.
Index Terms: sequence clustering, alignment-free, similarity measure, grammar-based distance, generalized relative entropy
Similar resources
A Novel Method for Content Base Image Retrieval Using Combination of Local and Global Features
Content-based image retrieval (CBIR) has been an active research topic in the last decade. In this paper we propose an image retrieval method using global and local features. First, for local feature extraction, the SURF algorithm produces a set of interest points for each image and a set of 64-dimensional descriptors for each interest point, and then, to use the Bag of Visual Words model, a cluster...
Evaluation of Updating Methods in Building Blocks Dataset
With the increasing use of spatial data in daily life, the production of this data from diverse information sources with different precisions and scales has grown widely. Generating new data requires a great deal of time and money. Therefore, one way to reduce costs is to update the old data at different scales using new data (produced on a similar scale). One approach to updating data i...
Link Prediction using Network Embedding based on Global Similarity
Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. Local link prediction approaches based on node structure are fast but not accurate enough. On the other hand, the global link predicti...
Clustering of Patients with Anemia Using a Data Mining Approach
Background: Anemia is the most common hematological disorder and occurs most often in women. Knowledge discovery through data mining of the large volumes of data associated with records of the disease can improve the quality of medical services. The goal of this study was to determine and evaluate the status of anemia using data mining algorithms. Methods: In this applied study, laboratory an...